Boosting with the L2-loss: Regression and Classification

Author

  • Bin Yu
Abstract

This paper investigates a computationally simple variant of boosting, L2Boost, which is constructed from a functional gradient descent algorithm with the L2-loss function. Like other boosting algorithms, L2Boost repeatedly applies a pre-chosen fitting method, called the learner, in an iterative fashion. Based on the explicit expression for the refitting of residuals in L2Boost, the case of (symmetric) linear learners is studied in detail in both regression and classification. In particular, with the boosting iteration m acting as the smoothing or regularization parameter, a new exponential bias-variance trade-off is found, with the variance (complexity) term increasing very slowly as m tends to infinity. When the learner is a smoothing spline, an optimal rate of convergence result holds for both regression and classification, and the boosted smoothing spline even adapts to higher-order, unknown smoothness. Moreover, a simple expansion of a (smoothed) 0-1 loss function is derived to reveal the importance of the decision boundary, bias reduction, and the impossibility of an additive bias-variance decomposition in classification. Finally, simulation and real data set results are obtained to demonstrate the attractiveness of L2Boost. In particular, we demonstrate that L2Boosting with a novel component-wise cubic smoothing spline is both practical and effective in the presence of high-dimensional predictors.
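The core mechanism described above — functional gradient descent under squared-error loss, where each iteration refits the learner to the current residuals — can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it substitutes a simple component-wise least-squares learner for the component-wise cubic smoothing spline the paper proposes, and the function name `l2boost` and step-size parameter `nu` are assumptions for the sketch.

```python
import numpy as np

def l2boost(X, y, nu=0.1, n_iter=100):
    """Sketch of L2Boost: functional gradient descent with L2 loss.

    Each iteration fits the base learner (here a hypothetical
    component-wise least-squares fit on a single predictor) to the
    current residuals, then adds a shrunken version to the fit.
    """
    n, p = X.shape
    f = np.full(n, y.mean())  # initialize at the sample mean
    for _ in range(n_iter):
        # Under L2 loss, the negative functional gradient is the residual.
        r = y - f
        # Component-wise learner: choose the single predictor whose
        # least-squares fit to the residuals minimizes the RSS.
        best = None
        for j in range(p):
            xj = X[:, j]
            beta = xj @ r / (xj @ xj)
            rss = np.sum((r - beta * xj) ** 2)
            if best is None or rss < best[0]:
                best = (rss, j, beta)
        _, j, beta = best
        f += nu * beta * X[:, j]  # shrunken update; larger n_iter = less smoothing
    return f
```

In this sketch the iteration count `n_iter` plays the role of m in the abstract: it is the effective smoothing parameter, with training error decreasing monotonically while the variance term grows only slowly.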


Related articles

On Weak Base Learners for Boosting Regression and Classification

The most basic property of the boosting algorithm is its ability to reduce the training error, subject to the critical assumption that the base learners generate weak hypotheses that are better than random guessing. We exploit analogies between regression and classification to give a characterization of which base learners generate weak hypotheses, by introducing a geometric concept called the an...


Boosting Regression via Classification

Boosting strategies are methods of improving the accuracy of a prediction (a classification rule) by combining many weaker predictions, each of which is only moderately accurate. In this paper we present a concise analysis of Freund and Schapire's AdaBoost algorithm [FS97], from which we derive a new boosting strategy for the regression case which is an extension of the algorithm discussed in...


Large Time Behavior of Boosting

We exploit analogies between regression and classification to study certain properties of boosting algorithms. A geometric concept called the angular span is defined and related to analogs of the VC dimension and the pseudo dimension of the regression and classification systems, and to the assumption of the weak learner. The exponential convergence rates of boosting algorithms are shown to be ...


Some Results on Weakly Accurate Base Learners for Boosting Regression and Classification

One basic property of the boosting algorithm is its ability to reduce the training error, subject to the critical assumption that the base learners generate `weak' (or more appropriately, `weakly accurate') hypotheses that are better than random guessing. We exploit analogies between regression and classification to give a characterization of which base learners generate weak hypotheses, by introdu...


Improving the Performance of Boosting for Naive Bayesian Classification

This paper investigates boosting naive Bayesian classification. It first shows that boosting cannot improve the accuracy of the naive Bayesian classifier on average in a set of natural domains. By analyzing the reasons for boosting's failures, we propose to introduce tree structures into naive Bayesian classification to improve the performance of boosting when working with naive Bayesian classificati...



Publication date: 2002